Fix 2 embeddings-related issues in server.cpp #324

k8si · 2024-04-05T22:45:37Z

Misc embeddings fixes:

Fix #303 (Llama.cpp server embeddings always return a 0 filled vector): the /embeddings server endpoint now returns actual embeddings rather than a 0-vector.
Fix #322 (Support BERT architecture in llamafile): pull in the upstream change from llama.cpp that improves locale handling during lowercasing. This was what caused an error when using a BERT model with the /embeddings endpoint. Efficient sentence embedding models like all-MiniLM-L6-v2 should now work without a problem.

TODO:

Still need to test calls to the /embeddings endpoint with a batch of texts. So far I have only tested with single texts (will finish this on Monday).
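A check along the following lines could cover the batch case. This is only a minimal sketch, not the script used for this PR: the helper names `is_zero_vector` and `check_embeddings` are hypothetical, and it assumes the response has already been parsed into a list of float vectors, one per input text. It makes no assumption about the request or response format of the endpoint itself.

```python
import math
from typing import List, Sequence


def is_zero_vector(embedding: Sequence[float], tol: float = 1e-12) -> bool:
    """Return True if every component is numerically zero (the symptom in #303)."""
    return all(abs(x) <= tol for x in embedding)


def check_embeddings(embeddings: Sequence[Sequence[float]]) -> List[str]:
    """Return a list of problems found in a batch of embedding vectors.

    An empty list means the batch looks sane: every vector has the same
    dimension, contains only finite values, and is not all zeros.
    """
    if not embeddings:
        return ["no embeddings returned"]

    problems: List[str] = []
    dim = len(embeddings[0])  # assumed: all vectors share the first one's size
    for i, emb in enumerate(embeddings):
        if len(emb) != dim:
            problems.append(f"item {i}: dimension {len(emb)} != {dim}")
        elif not all(math.isfinite(x) for x in emb):
            problems.append(f"item {i}: non-finite value")
        elif is_zero_vector(emb):
            problems.append(f"item {i}: all-zero vector")
    return problems
```

Running `check_embeddings` on the vectors returned for a single text and then for a batch of texts would flag any item that still comes back as a 0-vector.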

jart

This fix looks great. I've tested and confirmed it works locally using your script. I think this is worth merging. Please send me the tests on Monday. Then we'll close out the issue.

k8si added 2 commits April 5, 2024 18:18

fix embeddings 0-vector issue (#303)

3d5da90

Pull in upstream changes from llama.cpp related to unicode handling in the BERT tokenizer (#322)

0f3c1fe

k8si requested a review from jart April 5, 2024 22:45

jart marked this pull request as ready for review April 6, 2024 06:51

jart approved these changes Apr 6, 2024

jart merged commit 956fb58 into main Apr 6, 2024
